9 research outputs found

    Exploiting Conceptual Modeling for Searching Genomic Metadata: A Quantitative and Qualitative Empirical Study

    Providing a common data model for the metadata of several heterogeneous genomic data sources is hard, as they do not share any standard or agreed practice for metadata description. Two years ago we managed to discover a subset of common metadata present in most sources and to organize it as a smart genomic conceptual model (GCM); the model has been instrumental to our efforts in the development of a major software pipeline for data integration. More recently, we developed a user-friendly search interface, based on a simplified version of GCM. In this paper, we report our evaluation of the effectiveness of this new user interface. Specifically, we present the results of a compendious empirical study to answer the research question: How well is such a simple interface understood by a standard user? The target of this study is a mixed population, composed of biologists, bioinformaticians and computer scientists. The result of our empirical study shows that the users were successful in producing search queries starting from their natural language description, as they did so with good accuracy and a small error rate. The study also shows that most users were generally satisfied; it provides indications on how to improve our search system and how to continue our effort in the integration of genomic sources. We are consequently adapting the user interface, which will soon be opened to public use.

    From a Conceptual Model to a Knowledge Graph for Genomic Datasets

    Data access at genomic repositories is problematic, as data is described by heterogeneous and hardly comparable metadata. We previously introduced a unified conceptual schema, collected metadata in a single repository and provided classical search methods upon them. We here propose a new paradigm to support semantic search of integrated genomic metadata, based on the Genomic Knowledge Graph, a semantic graph of genomic terms and concepts, which combines the original information provided by each source with curated terminological content from specialized ontologies. Commercial knowledge-assisted search is designed for transparently supporting keyword-based search without explaining inferences; in biology, inference understanding is instead critical. For this reason, we propose a graph-based visual search for data exploration; some expert users can navigate the semantic graph along the conceptual schema, enriched with simple forms of homonyms and term hierarchies, thus understanding the semantic reasoning behind query results

    Towards Designing Conceptual Data Models for Big Data Warehouses: The Genomics Case

    Data Warehousing applied in Big Data contexts has been an emergent topic of research, as traditional Data Warehousing technologies are unable to deal with Big Data characteristics and challenges. The methods used in this field are already well systematized and adopted by practitioners, while research in Big Data Warehousing is only starting to provide some guidance on how to model such complex systems. This work contributes to the process of designing conceptual data models for Big Data Warehouses, proposing a method based on rules and design patterns, which aims to gather the information of a certain application domain mapped in a relational conceptual model. A complex domain that can benefit from this work is Genomics, characterized by an increasing heterogeneity, both in terms of content and data structure. Moreover, the challenges for collecting and analyzing genome data under a unified perspective have become a bottleneck for the scientific community, which is why standardized analytical repositories such as a Big Genome Warehouse can be of high value to the community. In the demonstration case presented here, a genomics relational model is merged with the proposed Big Data Warehouse Conceptual Metamodel to obtain the Big Genome Warehouse Conceptual Model, showing that the design rules and patterns can be applied having a relational conceptual model as starting point.
    This work has been supported by FCT - Fundação para a Ciência e Tecnologia within the Project Scope: UID/CEC/00319/2019, the Doctoral scholarship PD/BDE/135100/2017 and European Structural and Investment Funds in the FEDER component, through the Operational Competitiveness and Internationalization Programme (COMPETE 2020) [Project nº 039479; Funding Reference: POCI-01-0247-FEDER-039479]. We also thank both the Spanish State Research Agency and the Generalitat Valenciana under the projects DataME TIN2016-80811-P, ACIF/2018/171, and PROMETEO/2018/176.
    Galvão, J.; León-Palacio, A.; Costa, C.; Santos, MY.; Pastor López, O. (2020). Towards Designing Conceptual Data Models for Big Data Warehouses: The Genomics Case. Springer Nature. 3-19. https://doi.org/10.1007/978-3-030-63396-7_1

    Empowering Virus Sequence Research Through Conceptual Modeling

    The pandemic outbreak of the coronavirus disease has attracted attention towards the genetic mechanisms of viruses. We hereby present the Viral Conceptual Model (VCM), centered on the virus sequence and described from four perspectives: biological (virus type and hosts/sample), analytical (annotations, nucleotide and amino acid variants), organizational (sequencing project) and technical (experimental technology). VCM is inspired by GCM, our previously developed Genomic Conceptual Model, but it introduces many novel concepts, as viral sequences significantly differ from human genomes. When applied to SARS-CoV-2 virus, complex conceptual queries upon VCM are able to replicate the search results of recent articles, hence demonstrating huge potential in supporting virology research. Our effort is part of a broad vision: availability of conceptual models for both human genomics and viruses will provide important opportunities for research, especially if interconnected by the same human being, playing the role of virus host as well as provider of genomic and phenotype information

    Influence of recent immobilization or surgery on mortality in cancer patients with venous thromboembolism

    BACKGROUND: The influence of recent immobilization or surgery on mortality in cancer patients with venous thromboembolism (VTE) has not been thoroughly studied. METHODS: We used the RIETE Registry data to compare the 3-month mortality rate in cancer patients with VTE, with patients categorized according to the presence of recent immobilization, surgery or neither. The major outcomes were fatal pulmonary embolism (PE) and fatal bleeding within the first 3 months. RESULTS: Of 6,746 patients with active cancer and acute VTE, 1,224 (18%) had recent immobilization, 1,055 (16%) recent surgery, and 4,467 (66%) had neither. The all-cause mortality was 23.4% (95% CI: 22.4-24.5), and the PE-related mortality: 2.5% (95% CI: 2.1-2.9). Four in every ten patients dying of PE had recent immobilization (37%) or surgery (5.4%). Only 28% of patients with immobilization had received prophylaxis, as compared with 67% of the surgical. Fatal PE was more common in patients with recent immobilization (5.0%; 95% CI: 3.9-6.3) than in those with surgery (0.8%; 95% CI: 0.4-1.6) or neither (2.2%; 95% CI: 1.8-2.6). On multivariate analysis, patients with immobilization were at an increased risk for fatal PE (odds ratio: 1.8; 95% CI: 1.2-2.5). CONCLUSIONS: One in every three cancer patients dying of PE had recent immobilization for ≥ 4 days. Many of these deaths could have been prevented with adequate thromboprophylaxis.

    The interest of the Spanish network of investigators in back pain for rehabilitation physician

    Background: The Spanish Back Pain Research Network (REIDE) brings together teams of researchers and clinicians who are interested in nonspecific neck and back pain (BP). Its objective is to improve the efficacy, safety, effectiveness, and efficiency of the clinical management of BP. Method: The Network welcomes clinicians and researchers interested in BP. The only requirement to become a member of REIDE is to take part in one of its research projects, and any member can propose a new one. The Network supports those projects that are of interest to two or more groups by assuming their administration and management, which allows the researchers to focus on their task. Its working method ensures methodological quality, a multidisciplinary approach, and the clinical relevance of those projects that are carried out. Results: 179 researchers from 11 areas in Spain are involved in REIDE, including experts in all of the relevant fields of BP research. Most Spanish studies on BP that have been published in international scientific journals come from the teams involved in REIDE, and it currently has 13 ongoing research projects. Conclusions: The Network can help to enhance research among rehabilitation specialists who are interested in BP, and can contribute to the development of research projects which are of interest to the specialty. © 2005 Sociedad Española de Rehabilitación y Medicina Física (SERMEF) y Elsevier España, S.L

    Microalgal Biomass of Industrial Interest: Methods of Characterization

    Microalgae represent a new source of biomass for many applications. The advantage of microalgae over higher plants is their high productivity. The photoautotrophic microalgae include all photosynthetic microorganisms, i.e. Cyanobacteria (prokaryotes) or microalgae (eukaryotes). These microorganisms are characterized by a large biodiversity and chemodiversity. Consequently, the analysis of microalgal and cyanobacterial biomass often needs specific adaptations of the classical protocols for extraction as well as for quantification of their contents. This chapter reviews the main analytical methods used for the analysis of microalgae biomass and its main vaporizable compounds: proteins, polysaccharides, lipids, pigments and secondary metabolites.

    A second update on mapping the human genetic architecture of COVID-19


    Penumbral imaging and functional outcome in patients with anterior circulation ischaemic stroke treated with endovascular thrombectomy versus medical therapy: a meta-analysis of individual patient-level data
